AT&T at TREC-7 SDR Track

Authors

  • Amit Singhal
  • John Choi
  • Donald Hindle
  • Julia Hirschberg
  • Fernando Pereira
  • Steve Whittaker

Abstract

AT&T participated in the Spoken Document Retrieval (SDR) track of TREC-7. Our speech retrieval system uses modern Information Retrieval (IR) methods in conjunction with in-house automatic speech recognition (ASR). The novel feature of our TREC-7 work is the use of document expansion to reduce the performance loss due to ASR errors. Results show that retrieval from automatic transcriptions of speech is quite competitive with retrieval from human transcriptions. Our experiments indicate that document expansion can be used to further improve retrieval from automatic transcripts. This paper presents some analysis of document expansion in the context of the TREC-7 SDR track task.

Related resources

AT&T at TREC-7

This year AT&T participated in the ad-hoc task and the Filtering, SDR, and VLC tracks. Most of our effort for TREC-7 was concentrated on SDR and VLC tracks. On the filtering track, we tested a preliminary version of a text classification toolkit that we have been developing over the last year. In the ad-hoc task, we introduce a new tf-factor in our term weighting scheme and use a simplified retrieva...

TREC-6 1997 Spoken Document Retrieval Track Overview and Results

This paper describes the 1997 TREC-6 Spoken Document Retrieval (SDR) Track which implemented a first evaluation of retrieval of broadcast news excerpts using a combination of automatic speech recognition and information retrieval technologies. The motivations behind the SDR Track and background regarding its development and implementation are discussed. The SDR evaluation collection and topics ...

1998 TREC-7 Spoken Document Retrieval Track Overview and Results

This paper describes the 1998 TREC-7 Spoken Document Retrieval (SDR) Track which implemented an evaluation of retrieval of broadcast news excerpts using a combination of automatic speech recognition and information retrieval technologies. The motivations behind the SDR Track and background regarding its development and implementation are discussed. The SDR evaluation collection and topics are d...

AT&T at TREC-8

In 1999, AT&T participated in the ad-hoc task and the Question Answering (QA), Spoken Document Retrieval (SDR), and Web tracks. Most of our effort for TREC-8 focused on the QA and SDR tracks. Results from SDR track show that our document expansion techniques, presented in [8, 9], are very effective for speech retrieval. The results for question answering are also encouraging. Our system designed ...

AT&T at TREC-6

TREC-6 is AT&T's first independent TREC participation. We are participating in the main tasks (adhoc, routing), the filtering track, the VLC track, and the SDR track. This year, in the main tasks, we experimented with multi-pass query expansion using Rocchio's formulation. We concentrated a reasonable amount of our effort on our VLC track system, which is based on locally distributed, disjoint, and...

Publication date: 1997